# 💡 Thought Purity

## 🚀 Quick Start

### 📊 Data Processing

1. **Download datasets** and place them under `/TP_data_process/clean_data/`:
   - [Letter](https://papers.nips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html)
   - [CSQA](https://aclanthology.org/N19-1421/)
   - [GSM8K](https://arxiv.org/abs/2110.14168)
   - [StrategyQA](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00370/100680/Did-Aristotle-Use-a-Laptop-A-Question-Answering)

2. **Run scripts** under `/TP_data_process/process_COT/`
3. **Run scripts** under `/TP_data_process/backdoored_data/`
4. **Run scripts** under `/TP_data_process/labeled_data/`
5. **Check materials** under `/TP_data_process/grpo_meterial/`

### 🤖 Reinforcement Learning

1. **Download base model**
2. **Run scripts** under `/TP_GRPO_train/`
3. **Check new models** - The new weights are saved as `.safetensors` files in your checkpoint directory
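
To confirm that training actually produced weights, you can list the `.safetensors` files in the checkpoint directory. The snippet below is a minimal sketch using only the Python standard library; the helper name `find_safetensors` and the example path are hypothetical and are not part of this repository.

```python
from pathlib import Path


def find_safetensors(checkpoint_dir):
    """Return a sorted list of all .safetensors files under checkpoint_dir."""
    root = Path(checkpoint_dir)
    # rglob searches the directory and all of its subdirectories.
    return sorted(root.rglob("*.safetensors"))


if __name__ == "__main__":
    # Hypothetical path: replace with your own checkpoint directory.
    for path in find_safetensors("./checkpoints"):
        print(path)
```

An empty result suggests that training did not finish or that the checkpoint was written somewhere else.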

### 📈 Evaluation

- Evaluate the model under BadChain attacks with [BackdoorLLM-CoTA](https://github.com/bboylyg/BackdoorLLM/tree/main/attack/CoTA)


### ⚠️ Important Considerations

#### 📝 Data Handling
- Check the **format conversion** and update the **file paths** of the data to match your environment
- Verify that the **prompts** used during training and testing are the ones you intend
- Adjust the **hyperparameters** according to your hardware
- Training time **grows with the number of data examples**

#### 🔧 RL Algorithm Specifics
- RL algorithms are **inherently unstable** - adapt the RL prompt style to your base model
- **Monitor training stability** and watch for convergence issues during RL training

#### 🔄 Data Pipeline Flexibility
- The sample data represents **only one variant** - you can follow the provided scripts to:
  - Insert different triggers at various positions
  - Adjust input content while ensuring **no serious data leakage** occurs
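
As an illustration of inserting a trigger at different positions, here is a minimal word-level sketch. It is not the repository's script: the function name `insert_trigger`, the three position names, and the trigger string `@_@` are placeholders chosen for this example.

```python
def insert_trigger(question, trigger, position="end"):
    """Insert `trigger` into `question` at the start, middle, or end.

    The question is split on whitespace, so runs of spaces are
    collapsed to single spaces in the result.
    """
    words = question.split()
    if position == "start":
        idx = 0
    elif position == "middle":
        idx = len(words) // 2
    elif position == "end":
        idx = len(words)
    else:
        raise ValueError(f"unknown position: {position!r}")
    return " ".join(words[:idx] + [trigger] + words[idx:])


if __name__ == "__main__":
    # "@_@" is only a placeholder trigger for this example.
    for pos in ("start", "middle", "end"):
        print(insert_trigger("What is 2 plus 3", "@_@", pos))
```

Keeping the position as a parameter makes it easy to generate several backdoored variants from the same clean data.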

#### 🛠️ Model Configuration
- Most LLMs require **manual or automatic modification** of the model config files when label tokens are inserted
- Pay special attention to:
  - The `tokenizer_config` file
  - The `added_tokens` and `special_tokens_map` JSON files generated after the model is created
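
One quick way to check whether an inserted label token reached these files is to scan them for the token string. The sketch below assumes the model directory follows the usual Hugging Face layout (`tokenizer_config.json`, `added_tokens.json`, `special_tokens_map.json`); the helper names and the `<label>` token are hypothetical.

```python
import json
from pathlib import Path

# Assumed file names (Hugging Face style); adjust if your model differs.
TOKENIZER_FILES = (
    "tokenizer_config.json",
    "added_tokens.json",
    "special_tokens_map.json",
)


def contains_string(node, target):
    """Return True if `target` is a key or a string value anywhere in parsed JSON."""
    if isinstance(node, str):
        return node == target
    if isinstance(node, dict):
        return any(
            key == target or contains_string(value, target)
            for key, value in node.items()
        )
    if isinstance(node, list):
        return any(contains_string(item, target) for item in node)
    return False


def files_mentioning_token(model_dir, token):
    """Return the names of tokenizer JSON files in model_dir that mention `token`."""
    found = []
    for name in TOKENIZER_FILES:
        path = Path(model_dir) / name
        if not path.is_file():
            continue  # Not every model ships all three files.
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        if contains_string(data, token):
            found.append(name)
    return found


if __name__ == "__main__":
    # Hypothetical model path and label token.
    print(files_mentioning_token("./my_model", "<label>"))
```

If the token is missing from every file, the tokenizer was probably saved before the label was added, and the files need to be regenerated or edited by hand.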

#### 💻 Hardware Compatibility
- The default parameters in our code are chosen to run on a **single local GPU**
- Model integration methods are **compatible with general LLMs**



## 🙏 Acknowledgement

We thank the authors of the following repositories for their excellent work:

- [BadChain](https://github.com/Django-Jiang/BadChain)
- [BackdoorLLM](https://github.com/bboylyg/BackdoorLLM)